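Pixel-level semantic understanding of 2D objects is an important topic in computer vision and can support a deeper understanding of everyday objects (for example, their functionality and affordance). However, most previous methods train directly on correspondences in 2D images, which is end-to-end but loses a large amount of information about 3D space. In this paper, we propose a new method that predicts image correspondence semantics in the 3D domain and then projects them back onto 2D images to achieve pixel-level understanding. To obtain reliable 3D semantic labels, which are absent from current image datasets, we build a large-scale keypoint knowledge engine called KeyPointNet, which contains 103,450 keypoints and 8,234 3D models from 16 object categories. Our method leverages the advantages of 3D vision and can explicitly reason about object self-occlusion and visibility. We show that our method gives comparable or even superior results on standard semantic benchmarks.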
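Point cloud analysis without pose priors is very challenging in real applications, because the orientation of a point cloud is often unknown. In this paper, we propose a new point-set learning framework, PRIN (Point-wise Rotation Invariant Network), which focuses on rotation-invariant feature extraction in point cloud analysis. We construct spherical signals by density-aware adaptive sampling to handle distorted point distributions in spherical space. Spherical voxel convolution and point re-sampling are proposed to extract rotation-invariant features for each point. In addition, we extend PRIN to a sparse version called SPRIN, which operates directly on sparse point clouds. Both PRIN and SPRIN can be applied to tasks ranging from object classification and part segmentation to 3D feature matching and label alignment. Results show that, on datasets with randomly rotated point clouds, SPRIN performs better than state-of-the-art methods without any data augmentation. We also provide a thorough theoretical proof and analysis of the point-wise rotation invariance achieved by our methods. Our code is available at https://github.com/qq456cvb/sprin.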
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
Time-series anomaly detection is an important task and has been widely applied in industry. Since manual data annotation is expensive and inefficient, most applications adopt unsupervised anomaly detection methods, but the results are usually sub-optimal and unsatisfactory to end customers. Weak supervision is a promising paradigm for obtaining a considerable number of labels at low cost: it lets customers label data by writing heuristic rules rather than annotating each instance individually. However, in the time-series domain, it is hard for people to write reasonable labeling functions, because time-series data is numerically continuous and difficult to understand. In this paper, we propose a Label-Efficient Interactive Time-Series Anomaly Detection (LEIAD) system, which enables a user to improve the results of unsupervised anomaly detection with only a small number of interactions with the system. To achieve this goal, the system integrates weak supervision and active learning collaboratively while generating labeling functions automatically from only a few labeled data points. All of these techniques are complementary and can promote each other in a reinforced manner. We conduct experiments on three time-series anomaly detection datasets, demonstrating that the proposed system is superior to existing solutions in both the weak supervision and active learning areas. The system has also been tested in a real industrial scenario to show its practicality.
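To make the idea of a labeling function concrete, the following is a minimal sketch of weak supervision on a time series. It is not the LEIAD system: the two rules (`lf_zscore`, `lf_jump`), their thresholds, and the majority-vote combiner `weak_label` are hypothetical names and choices introduced purely for illustration. Each rule votes anomaly, normal, or abstains on a single point, and the votes are combined into one noisy label.

```python
from statistics import mean, pstdev

# Label values used by the hypothetical labeling functions below.
ANOMALY, NORMAL, ABSTAIN = 1, 0, -1


def lf_zscore(series, i, threshold=3.0):
    """Hypothetical rule: vote ANOMALY if point i is far from the series mean."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return ABSTAIN  # a constant series gives this rule nothing to judge
    return ANOMALY if abs(series[i] - mu) / sigma > threshold else ABSTAIN


def lf_jump(series, i, max_step=5.0):
    """Hypothetical rule: vote ANOMALY on a sudden jump from the previous point."""
    if i == 0:
        return ABSTAIN  # no previous point to compare against
    return ANOMALY if abs(series[i] - series[i - 1]) > max_step else NORMAL


def weak_label(series, i, labeling_functions):
    """Combine the non-abstaining votes for point i by strict majority."""
    votes = [lf(series, i) for lf in labeling_functions]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return ANOMALY if sum(votes) * 2 > len(votes) else NORMAL


# Example: a flat series with a single spike at index 10.
series = [1.0] * 10 + [50.0] + [1.0] * 10
LFS = [lf_zscore, lf_jump]
labels = [weak_label(series, i, LFS) for i in range(len(series))]
```

Rules like these are cheap to write but noisy: `lf_jump` also fires on the point right after the spike, because the drop back to baseline is itself a large step. This is the kind of imperfection that makes hand-written rules for continuous numeric data hard to get right.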
In this paper, we introduce a novel variation of model-agnostic meta-learning, in which an extra multiplicative parameter is introduced in the inner-loop adaptation. Our variation creates a shortcut in the parameter space for the inner-loop adaptation and increases model expressivity in a highly controllable manner. We show both theoretically and numerically that our variation alleviates the problem of conflicting gradients and improves training dynamics. We conduct experiments on three distinct problems: a toy classification problem for threshold comparison, a regression problem for wavelet transform, and a classification problem on MNIST. We also discuss ways to generalize our method to a broader class of problems.
Text-to-SQL semantic parsing is an important NLP task that greatly facilitates interaction between users and databases and has become a key component of many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL dataset, which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese). Building on MultiSpider, we further identify the lexical and structural challenges of text-to-SQL (caused by specific language properties and dialect sayings) and their intensity across different languages. Experimental results under three typical settings (zero-shot, monolingual, and multilingual) reveal a 6.1% absolute drop in accuracy in non-English languages. Qualitative and quantitative analyses are conducted to understand the reason for the performance drop in each language. Besides the dataset, we also propose a simple schema augmentation framework, SAVe (Schema-Augmentation-with-Verification), which significantly boosts the overall performance by about 1.8% and closes the 29.5% performance gap across languages.
Recently, deep neural networks, which require a large number of annotated samples, have been widely applied to nuclei instance segmentation of H&E stained pathology images. However, it is inefficient and unnecessary to label all pixels in a dataset of nuclei images, which usually contain similar and redundant patterns. Although unsupervised and semi-supervised learning methods have been studied for nuclei segmentation, very few works have delved into the selective labeling of samples to reduce the annotation workload. Thus, in this paper, we propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner. In the proposed framework, we first develop a novel consistency-based patch selection method to determine which image patches are the most beneficial to training. We then introduce a conditional single-image GAN with a component-wise discriminator to synthesize more training samples. Lastly, the framework trains an existing segmentation model with the augmented samples. The experimental results show that our proposed method can match the performance of a fully-supervised baseline by annotating fewer than 5% of pixels on some benchmarks.
The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications. Previous studies along this line primarily focused on perturbations in the natural language question side, neglecting the variability of tables. Motivated by this, we propose the Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure the robustness of Text-to-SQL models. Following this proposition, we curate ADVETA, the first robustness evaluation benchmark featuring natural and realistic ATPs. All tested state-of-the-art models experience dramatic performance drops on ADVETA, revealing models' vulnerability in real-world practices. To defend against ATP, we build a systematic adversarial training example generation framework tailored for better contextualization of tabular data. Experiments show that our approach not only brings the best robustness improvement against table-side perturbations but also substantially empowers models against NL-side perturbations. We release our benchmark and code at: https://github.com/microsoft/ContextualSP.
Given a natural language description of a user's demands, the NL2Code task aims to generate code that addresses those demands. This is a critical but challenging task that mirrors the capabilities of AI-powered programming. The NL2Code task is inherently versatile, diverse, and complex. For example, a demand can be described in different languages, in different formats, and at different levels of granularity. This motivated us to conduct this survey of NL2Code. In this survey, we focus on how neural networks (NNs) solve NL2Code. We first propose a comprehensive framework that is able to cover all studies in this field. We then parse the existing studies in depth into this framework. We create an online website to record the parsing results, which tracks existing and recent NL2Code progress. In addition, we summarize the current challenges of NL2Code as well as its future directions. We hope that this survey can foster the evolution of this field.
The task of text-to-SQL is to convert a natural language question to its corresponding SQL query in the context of relational tables. Existing text-to-SQL parsers generate a "plausible" SQL query for an arbitrary user question, thereby failing to correctly handle problematic user questions. To formalize this problem, we conduct a preliminary study on the observed ambiguous and unanswerable cases in text-to-SQL and summarize them into 6 feature categories. Correspondingly, we identify the causes behind each category and propose requirements for handling ambiguous and unanswerable questions. Following this study, we propose a simple yet effective counterfactual example generation approach for the automatic generation of ambiguous and unanswerable text-to-SQL examples. Furthermore, we propose a weakly supervised model DTE (Detecting-Then-Explaining) for error detection, localization, and explanation. Experimental results show that our model achieves the best result on both real-world examples and generated examples compared with various baselines. We will release data and code for future research.